83 research outputs found

    Exploiting individual users and user groups interaction features: methodology and infrastructure design

    The user may be a source of evidence for supporting information access through Digital Library (DL) systems. In particular, the features gathered while monitoring the interaction between the user and a DL system can be used as implicit indicators of the user's interests. However, each user has his own style of interaction, and a feature which is a reliable indicator with regard to one user may no longer be reliable when referred to another user. This suggests the need to develop personalized approaches for each user which are tailored for each search task. Nevertheless, the behavior of a group of interrelated users, e.g. performing the same task, may improve the contribution provided by the personal behavior; for instance, some interaction features, if considered individually, are more reliable with regard to a group of users. This paper introduces a methodology for exploiting both the behavior of individual users and groups of users as sources of evidence. The paper also introduces a software infrastructure implementing the methodology. The methodology is mainly based on a geometric framework, while the software infrastructure is based on a partially decentralized Peer-To-Peer (P2P) network, thus permitting the management of different sources of evidence.
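The core idea of blending individual and group-level evidence can be sketched as follows. This is a hypothetical illustration, not the paper's actual method: the feature names (`dwell_time`, `scroll_depth`), the reliability weights, and the linear blending rule are all invented for the example.

```python
# Hypothetical sketch: blend per-user and group-level reliability weights
# for interaction features used as implicit indicators of interest.
# All feature names and weight values below are invented for illustration.

def interest_score(features, user_weights, group_weights, alpha=0.5):
    """Score one document view by blending user and group reliability.

    features      : observed feature values for one document view
    user_weights  : how reliable each feature is for this specific user
    group_weights : how reliable each feature is for the user's group
    alpha         : trust placed in the individual vs. the group
    """
    score = 0.0
    for name, value in features.items():
        w = alpha * user_weights.get(name, 0.0) + (1 - alpha) * group_weights.get(name, 0.0)
        score += w * value
    return score

obs = {"dwell_time": 0.8, "scroll_depth": 0.4}
user_w = {"dwell_time": 0.9, "scroll_depth": 0.1}   # dwell time reliable for this user
group_w = {"dwell_time": 0.5, "scroll_depth": 0.7}  # scroll depth reliable for the group
print(round(interest_score(obs, user_w, group_w), 3))  # 0.72
```

A feature that is unreliable for one user (here, scroll depth) still contributes through the group term, which is the paper's motivation for combining the two sources.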

    Modeling the Evolution of Context in Information Retrieval

    An Information Retrieval (IR) system ranks documents according to their predicted relevance to a formulated query. The prediction depends on the ranking algorithm adopted and on the assumptions about relevance underlying the algorithm. The main assumption is that there is one user, one information need for each query, one location where the user is, and no temporal dimension. But this assumption is unlikely to hold: relevance is context-dependent. Exploiting the context in a way that does not require a high user effort may be effective in IR, as suggested for example by Implicit Relevance Feedback techniques. The high number of factors to be considered by these techniques suggests the adoption of a theoretical framework which naturally incorporates multiple sources of evidence. Moreover, the information provided by the context might be a useful source of evidence for personalizing the results returned to the user. Indeed, the information need arises and evolves in the present and past context of the user. Since the context changes in time, modeling the way in which the context evolves might contribute to achieving personalization. Starting from some recent reconsiderations of the geometry underlying IR and their contribution to modeling context, this paper discusses some issues which will be the starting point for my PhD research activity.
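One minimal way to picture a context that evolves in time is a term profile in which older evidence decays as new implicit signals arrive. This is a sketch under assumed conventions (exponential decay, unit-weight term observations), not the geometric model the paper discusses.

```python
# Illustrative sketch (not the paper's model): a context profile over terms
# that evolves in time, with older evidence decaying exponentially so the
# most recent interactions dominate the representation of the need.

def update_context(profile, observed_terms, decay=0.8):
    """Decay the existing profile, then add newly observed term evidence."""
    new_profile = {t: w * decay for t, w in profile.items()}
    for term in observed_terms:
        new_profile[term] = new_profile.get(term, 0.0) + 1.0
    return new_profile

profile = {}
for session_terms in [["retrieval", "ranking"], ["ranking", "context"], ["context"]]:
    profile = update_context(profile, session_terms)

# "context" was seen most recently, so it now outweighs "retrieval",
# which was observed only in the oldest session.
assert profile["context"] > profile["retrieval"]
```

The decay constant controls how quickly the represented need "forgets" past context, which is the kind of temporal dimension the standard one-query-one-need assumption ignores.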

    Improving Information Retrieval Effectiveness in Peer-to-Peer Networks through Query Piggybacking

    This work describes an algorithm which aims at increasing the quantity of relevant documents retrieved from a Peer-To-Peer (P2P) network. The algorithm is based on a statistical model used for ranking documents, peers and ultra-peers, and on a “piggybacking” technique performed when the query is routed across the network. The algorithm “amplifies” the statistical information about the neighborhood stored in each ultra-peer. The preliminary experiments provided encouraging results: the quantity of relevant documents retrieved through the network almost doubles once query piggybacking is exploited.
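The piggybacking idea can be simulated in miniature: as a query message hops across ultra-peers, each one merges its local neighborhood term statistics into the message, so later peers on the route see "amplified" evidence. The peers, their term counts, and the message layout below are invented for illustration and are not the paper's actual statistical model.

```python
# A minimal, hypothetical simulation of query piggybacking in a P2P network:
# each ultra-peer on the route merges its neighborhood term statistics into
# the routed query message before forwarding it. All data here is invented.

def forward(message, ultra_peer_stats):
    """Merge an ultra-peer's neighborhood statistics into the routed query."""
    for term, count in ultra_peer_stats.items():
        message["stats"][term] = message["stats"].get(term, 0) + count
    return message

query = {"terms": ["retrieval"], "stats": {}}
route = [
    {"retrieval": 3, "music": 1},   # ultra-peer A's neighborhood
    {"retrieval": 5, "video": 2},   # ultra-peer B's neighborhood
]
for stats in route:
    query = forward(query, stats)

# The last peer on the route sees evidence accumulated from the whole path.
print(query["stats"])  # {'retrieval': 8, 'music': 1, 'video': 2}
```

A peer receiving this message knows that `retrieval` is well represented along the path, information it could not have derived from its own neighborhood alone.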

    Tracking biomedicalization in the media: Public discourses on health and medicine in the UK and Italy, 1984–2017

    This article examines historical trends in the reporting of health, illness and medicine in UK and Italian newspapers from 1984 to 2017. It focuses on the increasing “biomedicalization” of health reporting and the framing of health and medicine as a matter of technoscientific interventions. Methodologically, we relied on two large datasets consisting of all the health- and medicine-related articles published in the online archives of The Guardian (UK) and la Repubblica (Italy). These articles underwent a quantitative analysis, based on topic modelling techniques, to identify and analyse relevant topics in the datasets. Moreover, we developed some synthetic indices to support the analysis of how medical and health news is “biomedicalized” in media coverage. Theoretically, we emphasise that the media represent a constitutive environment in shaping biomedicalization processes. Our analyses show that, across the period under scrutiny, biomedicalization is a relevant, even if sometimes ambivalent, frame in the media sphere, giving growing centrality to three dimensions: i) health and well-being as a matter of individual commitment to self-monitoring and self-surveillance; ii) biomedicine as a large technoscientific enterprise emerging from the entanglement between research fields and their technological embodiments; iii) the manifold reforms of welfare systems facing the trade-off between universal health coverage and the need to render the national healthcare system more sustainable and compatible with non-expansionary monetary policies and austerity approaches to managing state government budgets.
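To make the notion of a "synthetic index" concrete, one toy possibility is the share of articles in a yearly corpus that use technoscientific vocabulary. This is only a hypothetical illustration: the keyword list, the sample sentences, and the index definition are all invented and are not the authors' actual indices.

```python
# A toy "synthetic index" (invented, not the article's): the share of
# health articles that contain at least one technoscientific term.
# The keyword set and the sample articles are illustrative assumptions.

TECHNO_TERMS = {"gene", "biomarker", "screening", "clinical trial"}

def biomedicalization_index(articles):
    """Fraction of articles containing at least one technoscientific term."""
    if not articles:
        return 0.0
    hits = sum(1 for text in articles if any(t in text.lower() for t in TECHNO_TERMS))
    return hits / len(articles)

corpus_1990 = ["A report on hospital funding.", "New gene therapy trialled."]
corpus_2015 = ["Biomarker screening expands.", "Gene editing debated.", "Diet advice."]
print(round(biomedicalization_index(corpus_1990), 2))  # 0.5
print(round(biomedicalization_index(corpus_2015), 2))  # 0.67
```

Tracking such an index year by year is one simple way to quantify a trend that the article studies with the richer machinery of topic modelling.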

    Lucene4IR: Developing information retrieval evaluation resources using Lucene

    The workshop and hackathon on developing Information Retrieval Evaluation Resources using Lucene (L4IR) was held on the 8th and 9th of September, 2016 at the University of Strathclyde in Glasgow, UK, and was funded by the ESF Elias Network. The event featured three main elements: (i) a series of keynote and invited talks on industry, teaching and evaluation; (ii) planning, coding and hacking, where a number of groups created modules and infrastructure to use Lucene to undertake TREC-based evaluations; and (iii) a number of breakout groups discussing challenges, opportunities and problems in bridging the divide between academia and industry, and how we can use Lucene for teaching and learning Information Retrieval (IR). The event brought together a blend of academics, experts and students wanting to learn, share and create evaluation resources for the community. The hacking was intense and the discussions lively, creating the basis of many useful tools but also raising numerous issues. It was clear that, by adopting and contributing to the most widely used and supported open-source IR toolkit, there were many benefits for academics, students, researchers, developers and practitioners: a basis for stronger evaluation practices, increased reproducibility, more efficient knowledge transfer, greater collaboration between academia and industry, and shared teaching and training resources.

    EVALITA Evaluation of NLP and Speech Tools for Italian - December 17th, 2020

    Welcome to EVALITA 2020! EVALITA is the evaluation campaign of Natural Language Processing and Speech Tools for Italian. EVALITA is an initiative of the Italian Association for Computational Linguistics (AILC, http://www.ai-lc.it) and is endorsed by the Italian Association for Artificial Intelligence (AIxIA, http://www.aixia.it) and the Italian Association for Speech Sciences (AISV, http://www.aisv.it).

    Design, Implementation and Evaluation of a Methodology for Utilizing Sources of Evidence in Relevance Feedback

    The objective of an Information Retrieval system is to support the user searching for information by predicting the documents relevant to his information need. Prediction is performed on the basis of the evidence available during the search process. User interactions are examples of sources from which this evidence can be gathered. This thesis addresses the problem of uniformly modeling heterogeneous forms of user interaction that are selected as sources for feedback. The problem of uniform source modeling is addressed by way of a complete methodology. The methodology aims at designing, implementing and evaluating a system that validates an experimental hypothesis. The hypothesis being validated concerns the possible factors that can explain the user's perception of relevance through the evidence gathered from the user interaction. The objective is to obtain and exploit a usable representation of the factors in the role of a new dimension of the information need representation. The methodology aims at being general and not tailored to a specific source. It defines the set of steps needed for obtaining a vector subspace-based representation of the information need dimensions and for further exploiting this representation for relevance prediction purposes. The steps identified are source selection, evidence collection, dimension modeling, document modeling and prediction. This thesis shows how the methodology can be used for modeling two sources of evidence: the relationships between terms in documents judged as relevant, and the relationships between interaction features gathered from the behavior of the user when interacting with a set of documents. As for the term relationship dimension, this thesis shows that the current implementation is feasible with a very large text collection, delivered within the 2009 and 2010 Relevance Feedback tracks of the Text Retrieval Conference initiative.
The methodology has supported the evaluation of term relationships for document re-ranking. As for interaction feature relationships, this thesis investigates the adoption of the user behavior dimension for document re-ranking, both without and with query expansion.
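The subspace-based prediction step can be sketched in miniature: build an orthonormal basis from vectors derived from judged-relevant documents, then re-rank candidates by how much of each document vector lies inside that subspace. The toy term-frequency vectors below are invented, and the unnormalized squared-projection score is a simplifying assumption, not the thesis's exact formulation.

```python
# A hedged sketch of subspace-based re-ranking: span a subspace with
# vectors from judged-relevant documents (via Gram-Schmidt), then score
# candidates by the squared length of their projection onto it.
# Vectors are toy term-frequency vectors, not the thesis's data.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Orthonormal basis for the span of the given vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = dot(w, w) ** 0.5
        if norm > 1e-12:          # skip vectors already in the span
            basis.append([wi / norm for wi in w])
    return basis

def subspace_score(doc, basis):
    """Squared length of the projection of doc onto the subspace."""
    return sum(dot(doc, b) ** 2 for b in basis)

relevant = [[1, 1, 0], [0, 1, 1]]          # vectors from judged-relevant documents
basis = gram_schmidt(relevant)
candidates = {"d1": [1, 1, 1], "d2": [0, 1, 1], "d3": [1, 0, 0]}
ranking = sorted(candidates, key=lambda d: subspace_score(candidates[d], basis), reverse=True)
print(ranking)  # ['d1', 'd2', 'd3']
```

Here `d2` lies entirely inside the relevance subspace and `d3` only partially, so the score orders candidates by how well they fit the dimension induced by the relevant documents.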